Unicode block
A Unicode block is one of several contiguous ranges of numeric character codes (code points) of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes. Typically, proposals such as the addition of new characters are discussed and evaluated by considering the relevant block or blocks as a whole. Each block is generally, but not always, meant to include all the characters used by one or more specific languages, or in some general application area such as mathematics, surveying, decorative typesetting, or social forums.
Unicode blocks are identified by unique names, which use only ASCII characters and are usually descriptive, in English, of the nature of the symbols, such as "Tibetan" or "Supplemental Arrows-A". When comparing block names, one is supposed to equate uppercase with lowercase letters and ignore any whitespace, hyphens, and underscores, so the latter name is equivalent to "supplemental_arrows__a" and "SUPPLEMENTALARROWSA". ("Unicode Blocks data file - Unicode version 12.0". Unicode Consortium. Accessed on 2019-04-29.)
Blocks are pairwise disjoint; that is, they do not overlap. The starting code point and the size (number of code points) of each block are always multiples of 16; therefore, in hexadecimal notation, the starting (smallest) point is U+xxx0 and the ending (largest) point is U+yyyF, where xxx and yyy are three or more hexadecimal digits. (These constraints are intended to simplify the display of glyphs in Unicode Consortium documents, as tables with 16 columns labeled with the last hexadecimal digit of the code point.) The size of a block may range from a minimum of 16 to a maximum of 65,536 code points.
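The name-matching rule and the range constraints described above can be illustrated with a minimal Python sketch. The function names (`normalize_block_name`, `is_valid_block_range`, `are_pairwise_disjoint`) are hypothetical helpers written for this illustration and are not part of any Unicode Consortium tool or standard library; the sample ranges are those of the "Tibetan", "Supplemental Arrows-A", and "Supplementary Private Use Area-A" blocks.

```python
import re


def normalize_block_name(name: str) -> str:
    """Return a comparison key for a block name (hypothetical helper).

    Uppercase and lowercase letters are equated, and whitespace,
    hyphens, and underscores are ignored.
    """
    return re.sub(r"[\s\-_]", "", name).lower()


def is_valid_block_range(start: int, end: int) -> bool:
    """Check the alignment and size constraints on a block (hypothetical helper).

    The starting code point and the size must both be multiples of 16,
    and the size must lie between 16 and 65,536 code points.
    """
    size = end - start + 1
    return start % 16 == 0 and size % 16 == 0 and 16 <= size <= 65536


def are_pairwise_disjoint(ranges: list[tuple[int, int]]) -> bool:
    """Check that no two inclusive (start, end) ranges overlap."""
    ordered = sorted(ranges)
    return all(
        prev_end < next_start
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    )


# Sample blocks used for illustration: name -> (first, last) code point.
SAMPLE_BLOCKS = {
    "Tibetan": (0x0F00, 0x0FFF),
    "Supplemental Arrows-A": (0x27F0, 0x27FF),
    "Supplementary Private Use Area-A": (0xF0000, 0xFFFFF),
}

if __name__ == "__main__":
    for name, (start, end) in SAMPLE_BLOCKS.items():
        print(f"{normalize_block_name(name)}: U+{start:04X}..U+{end:04X}",
              is_valid_block_range(start, end))
    print(are_pairwise_disjoint(list(SAMPLE_BLOCKS.values())))
```

Because the starting point and the size are both multiples of 16, the last hexadecimal digit of a block's first code point is always 0 and that of its last code point is always F, which is what the modulo checks in `is_valid_block_range` express.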
Every assigned code point has a property called "Block", whose value is a character string naming the unique block that owns that point. (Unicode glossary.) However, a block may also contain unassigned code points, usually reserved for future additions of characters that "logically" should belong to that block. Code points not belonging to any of the named blocks, e.g. in the unassigned planes 3–13, have the Block value "No_Block".
Each Unicode code point also has a property called "General Category", which attempts to describe the role of the corresponding symbol in the languages or applications for whose sake it was included in the system. Examples of General Categories are "Lu" (upper-case letter), "Nd" (decimal digit), "Pi" (initial, i.e. opening, quote punctuation), and "Mn" (non-spacing mark, e.g. a diacritic applied to the preceding character). This division is completely independent of blocks: the code points with a given General Category generally span many blocks, and do not have to be consecutive, not even within each block. ("Unicode Core Specification - version 12.0 - Chapter 4: Character Properties". Accessed on 2019-04-29.)
In descriptions of the Unicode system, a block may be subdivided into more specific subgroups, such as the "Chess symbols" in the block "Miscellaneous Symbols". Those subgroups are not "blocks" in the technical sense used by the Unicode Consortium, and are named only for the convenience of users.
Unicode 12.1 defines 300 blocks (Unicode 12.1.0 UCD):
* 163 in plane 0, the Basic Multilingual Plane (BMP)
* 127 in plane 1, the Supplementary Multilingual Plane (SMP)
* 6 in plane 2, the Supplementary Ideographic Plane (SIP)
* 2 in plane 14 (E in hexadecimal), the Supplementary Special-purpose Plane (SSP)
* One each in planes 15 (F in hexadecimal) and 16 (10 in hexadecimal), called Supplementary Private Use Area-A and -B
External links
* Official web site of the Unicode Consortium (English)